Canine gene microarrays

ABSTRACT

The present invention is based on the identification of novel canine nucleic acid sequences and the construction of canine microarrays containing a significant portion of the canine genome. The microarrays specifically hybridize to canine nucleic acid samples and may be used in drug screening and toxicity assays.

RELATED APPLICATIONS

This application claims priority under 35 U.S.C. §119(e) to U.S. Provisional Application 60/377,240, filed May 3, 2002, which is herein incorporated by reference in its entirety.

SEQUENCE LISTING SUBMISSION ON COMPACT DISC

The Sequence Listing submitted concurrently herewith on compact disc under 37 C.F.R. §§1.821(c) and 1.821(e) is herein incorporated by reference in its entirety. Four copies of the Sequence Listing, one on each of four compact discs, are provided. Copy 1, Copy 2 and Copy 3 are identical. Copies 1, 2 and 3 are also identical to the CRF. Each electronic copy of the Sequence Listing was created on May 2, 2002 with a file size of 8868 KB. The filenames are as follows: Copy1-g15116wo.txt; Copy2-g15116wo.txt; Copy 3-g15116wo.txt; CRF-g15116wo.txt.

BACKGROUND OF THE INVENTION

The need for methods of assessing the impact, including toxicity, of a compound, pharmaceutical agent or environmental pollutant on a cell or living organism has led to the development of procedures which utilize living organisms as biological monitors. The simplest and most convenient of these systems utilize unicellular microorganisms such as yeast and bacteria, since they are most easily maintained and manipulated. Unicellular screening systems also often use easily detectable changes in phenotype to monitor the effect of test compounds on the cell. Unicellular organisms, however, are inadequate models for estimating the potential effects of many compounds on complex multicellular animals, as they do not have the ability to carry out biotransformations to the extent or at levels found in higher organisms.

The biotransformation of chemical compounds by multicellular organisms is a significant factor in determining the effects, including toxicity, of agents to which they are exposed. Accordingly, multicellular screening systems may be preferred or required to detect the toxic effects of compounds. The use of multicellular organisms as screening tools has been significantly hampered, however, by the lack of convenient screening mechanisms or endpoints, such as those available in yeast or bacterial systems. In an attempt to compensate for the deficiencies of single cell testing systems, animal models using small laboratory species such as rats and mice have been developed. Such models, however, do not always provide an accurate picture of cellular responses induced in higher mammals such as humans. Accordingly, higher order mammals such as dogs are often required in the later stages of pharmaceutical testing or in testing the biological effects of known or potential toxins.

In addition, safety guidelines in the pharmaceutical, food and chemical industries in many countries require pre-clinical toxicity testing of every product in at least two species: one rodent species, usually the rat, and one non-rodent species, usually the dog (Smith et al., Lab Anim 35(2):117-130 (2001); Broadhead et al., Hum Exp Toxicol 19(8):440-447 (2000); Zbinden, Regul Toxicol Pharmacol 17(1):85-94 (1993)). In accordance with legal requirements for acute and repeated-dose toxicity testing, large-scale studies are usually undertaken, entailing the use of many dogs. Although primates, such as macaques and marmosets, may also be used as the non-rodent, large animal species, it is likely that the dog will remain the principal large animal used in testing.

There have been recent attempts in the pharmaceutical industry to redesign pre-clinical testing, so that fewer animals can be used and so that their use is more targeted. Because toxicity data from testing in dogs is known to be predictive for humans, however, testing in dogs cannot be eliminated.

Thus, there is a need for sensitive and rapid methods of detecting cellular responses and differential gene expression in animal models in response to therapeutic agents, particularly methods that can accommodate large numbers of samples. Techniques employing microarrays, especially microarrays containing a high percentage of a large animal's genome (such as a dog's), are, therefore, likely to be the most useful in providing information about responses to therapeutic agents or toxins that would be seen in other large animals, such as humans.

SUMMARY OF THE INVENTION

The present invention includes a set of cDNA sequences representative of the expressed genome of a dog. The present invention also includes microarrays containing probes that hybridize to mRNA sequences corresponding to the canine genes. The sequences on these microarrays represent a large portion of the canine genome, and these microarrays are capable of detecting changes in gene expression level in a large percentage of canine genes.

Additionally, the present invention includes methods of using the microarray chips to detect or monitor changes in gene expression in a tissue or cell sample, such as a toxic response in dogs after exposure of the dogs to a known toxin or to a compound with unknown toxic properties. The microarray chips are capable of detecting up- or down-regulation of a large percentage of the genes in the canine genome following exposure of the animal to a known or unknown toxin, and a profile of the genes that are up- and/or down-regulated can be produced. Genes within the profile can be selected as marker genes and their expression level determined in subjects undergoing toxicity response testing. The methods of the present invention may also be used to detect genes that are up- or down-regulated in canines in a disease state. A profile of these genes may then be produced, and marker genes may be identified. Expression levels of these genes may be used in the identification and monitoring of diseases in canines. In addition, expression levels of genes identified as marker genes may be used to detect and monitor a positive or negative response to a medical or pharmaceutical treatment.
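By way of illustration only, a profile of up- and down-regulated genes might be assembled from two sets of expression measurements as sketched below. The function name, the dictionary inputs and the two-fold cutoff are assumptions made for this example and are not taken from the specification; any suitable statistical criterion could be substituted.

```python
import math


def expression_profile(control, treated, fold_threshold=2.0):
    """Label genes as "up" or "down" by fold change between two samples.

    control, treated: dicts mapping a gene identifier to a positive
    expression level (hypothetical input format).
    fold_threshold: illustrative cutoff; the two-fold default is an assumption.
    Genes whose change falls inside the cutoff are left out of the profile.
    """
    log_cut = math.log2(fold_threshold)
    profile = {}
    # Only genes measured in both samples can be compared.
    for gene in control.keys() & treated.keys():
        log_ratio = math.log2(treated[gene] / control[gene])
        if log_ratio >= log_cut:
            profile[gene] = "up"
        elif log_ratio <= -log_cut:
            profile[gene] = "down"
    return profile
```

Genes appearing in the returned profile would be candidates for selection as marker genes.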

The present invention also includes a computer system comprising a database of the genes and gene fragments herein described, in which the database also includes information identifying the expression level of genes in at least one tissue or cell sample, such as normal and toxin-exposed canine tissues. The database may also include descriptive information from external databases. Further, the present invention includes methods of using the computer system to present information comparing the expression level of the genes in the database in normal and in toxin-exposed tissues and cells.

Finally, the present invention includes kits comprising the canine microarrays, along with sequence information and gene expression information regarding the gene expression levels in at least one tissue or cell sample.

DETAILED DESCRIPTION

Many biological functions are accomplished by altering the expression of various genes through transcriptional (e.g. through control of initiation, provision of RNA precursors, RNA processing, etc.) and/or translational control. For example, fundamental biological processes such as cell cycle, cell differentiation and cell death are often characterized by the variations in the expression levels of groups of genes.

Changes in gene expression are also associated with the effects of various chemicals, drugs, toxins, pharmaceutical agents and pollutants on an organism or cells. For example, the lack of sufficient expression of functional tumor suppressor genes and/or the overexpression of oncogenes/protooncogenes after exposure to an agent could lead to tumorigenesis or hyperplastic growth of cells (Marshall, Cell, 64:313-326 (1991); Weinberg, Science, 254:1138-1146 (1991)). Thus, changes in the expression levels of particular genes (e.g. oncogenes or tumor suppressors) may serve as signposts for the presence and progression of toxicity or other cellular responses to exposure to a particular compound.

Monitoring changes in gene expression may also provide certain advantages during drug screening and development. Often drugs are screened for the ability to interact with a major target without regard to other effects the drugs have on cells. These cellular effects may cause toxicity in the whole animal, which prevents the development and clinical use of the potential drug.

The present invention is based, in part, on the identification of new canine genes, including new canine genes that are expressed in one or more tissues, such as liver, kidney, heart, brain and testicular tissue. These genes correspond to the canine cDNA of SEQ ID NOS: 1-11,109.

The genes of the invention may be used as diagnostic agents or markers to detect a cellular response in a sample individually or as part of a gene expression profile. They can also serve as a target for agents that modulate gene expression or activity. For example, agents may be identified that modulate gene expression levels as a means of modulating aberrant biological processes associated with a cellular response, such as inflammation, cytotoxicity, hyperplastic growth or disruption of the cell cycle.

Nucleic Acid Molecules

The present invention provides nucleic acid molecules corresponding to the genes or sequences described herein, preferably in isolated form. As used herein, “nucleic acid” includes RNA or DNA that comprises any one of SEQ ID NOS: 1-11,109, is complementary to any of these sequences, specifically hybridizes to a nucleic acid of SEQ ID NOS: 1-11,109 and remains stably bound to it under appropriate stringency conditions, and/or exhibits greater than about 90% or 95% or more nucleotide sequence identity through greater than about 90% or 95% of the sequence length of SEQ ID NOS: 1-11,109.

Specifically contemplated are genomic DNA, cDNA, mRNA and antisense molecules, as well as nucleic acids based on alternative backbones or including alternative bases, whether derived from natural sources or synthesized. Such hybridizing or complementary nucleic acids, however, are defined further as being novel and unobvious over any prior art nucleic acid including that which encodes, hybridizes under appropriate stringency conditions, or is complementary to nucleic acid encoding a protein according to the present invention.

Homology or identity at the nucleotide or amino acid sequence level is determined by BLAST (Basic Local Alignment Search Tool) analysis using the algorithm employed by the programs blastp, blastn, blastx, tblastn and tblastx (Altschul S. F. et al., Nucleic Acids Res 25:3389-3402 (1997), and Karlin et al., Proc Natl Acad Sci USA 87:2264-2268 (1990), both fully incorporated by reference) which are tailored for sequence similarity searching. The approach used by the BLAST program is to first consider similar segments, with and without gaps, between a query sequence and a database sequence, then to evaluate the statistical significance of all matches that are identified and finally to summarize only those matches which satisfy a pre-selected threshold of significance. For a discussion of basic issues in similarity searching of sequence databases, see Altschul et al., Nature Genetics 6:119-129 (1994), which is fully incorporated by reference. The search parameters for histogram, descriptions, alignments, expect (i.e., the statistical significance threshold for reporting matches against database sequences), cutoff, matrix and filter (low complexity) are at the default settings. The default scoring matrix used by blastp, blastx, tblastn, and tblastx is the BLOSUM62 matrix (Henikoff et al., Proc Natl Acad Sci USA 89:10915-10919 (1992), fully incorporated by reference), recommended for query sequences over 85 nucleotides or amino acids in length.

For blastn, the scoring matrix is set by the ratios of M (i.e., the reward score for a pair of matching residues) to N (i.e., the penalty score for mismatching residues), wherein the default values for M and N are 5 and −4, respectively. Four blastn parameters were adjusted as follows: Q=10 (gap creation penalty); R=10 (gap extension penalty); E value=10 (expected number of matches in the sequence database(s) purely by chance based on a random sequence model); word size=11. The equivalent Blastp parameter settings were Q=9; R=2; wink=1; and gapw=32. A Bestfit comparison between sequences, available in the GCG package version 10.0, uses DNA parameters GAP=50 (gap creation penalty) and LEN=3 (gap extension penalty) and the equivalent settings in protein comparisons are GAP=8 and LEN=2.
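As a simple illustration of how the M and N values enter a nucleotide alignment score, the sketch below scores an ungapped alignment of two equal-length sequences using the default values stated above (M=5, N=−4). The function name is hypothetical, and the sketch deliberately omits gap penalties, word seeding and significance statistics; it is not the BLAST algorithm itself.

```python
def ungapped_score(query, subject, match=5, mismatch=-4):
    """Score an ungapped alignment of two equal-length nucleotide strings.

    match and mismatch correspond to M and N in the text (defaults 5 and -4).
    Gap creation/extension penalties (Q, R) are not modeled here.
    """
    if len(query) != len(subject):
        raise ValueError("ungapped alignment requires equal-length sequences")
    pairs = zip(query.upper(), subject.upper())
    return sum(match if a == b else mismatch for a, b in pairs)
```

For example, four aligned positions with three matches and one mismatch would score 3×5 − 4 = 11.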

“Stringent conditions” are those that (1) employ low ionic strength and high temperature for washing, for example, 0.015 M NaCl/0.0015 M sodium citrate/0.1% SDS at 50° C., or (2) employ during hybridization a denaturing agent such as formamide, for example, 50% (vol/vol) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM NaCl, 75 mM sodium citrate at 42° C. Another example is hybridization in 50% formamide, 5× SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5× Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate at 42° C., with washes at 42° C. in 0.2× SSC and 0.1% SDS. A skilled artisan can readily determine and vary the stringency conditions appropriately to obtain a clear and detectable hybridization signal. When hybridizing an oligonucleotide to mRNA or cRNA from a cell sample, the “stringent conditions” under which the oligonucleotide or probe specifically binds to a nucleic acid molecule of the invention can be calculated by one of ordinary skill in the art (see below).

As used herein, a nucleic acid molecule is said to be “isolated” when the nucleic acid molecule is substantially separated from contaminant nucleic acid molecules encoding other polypeptides.

The present invention further includes fragments of the nucleic acid molecules as herein described, e.g. hybridization probes or oligonucleotides. As used herein, a fragment of a nucleic acid molecule refers to a small portion of a sequence as herein described. The size of the fragment will be determined by the intended use. For example, if the fragment is chosen so as to encode a protein or an active portion of the protein, the fragment will need to be large enough to encode the full protein or the functional region(s) of the protein. For instance, fragments which encode peptides corresponding to predicted antigenic regions may be prepared. If the fragment is to be used as a nucleic acid probe or PCR primer, then the fragment length is chosen so as to obtain a relatively small number of false positives during probing/priming.

Fragments of the nucleic acid molecules of the present invention (i.e., synthetic oligonucleotides) that are used as probes or specific primers for the polymerase chain reaction (PCR), or to synthesize gene sequences encoding proteins, can easily be synthesized by chemical techniques, for example, the phosphoramidite method of Matteucci et al. (J Am Chem Soc 103:3185-3191 (1981)) or using automated synthesis methods. In addition, larger DNA segments can readily be prepared by well known methods, such as synthesis of a group of oligonucleotides that define various modular segments of the gene, followed by ligation of oligonucleotides to build the complete modified gene.

The nucleic acid molecules of the present invention may further be modified so as to contain a detectable label for diagnostic and probe purposes. A variety of such labels are known in the art and can readily be employed with the encoding molecules herein described. Suitable labels include, but are not limited to, biotin, radiolabeled nucleotides and the like. A skilled artisan can readily employ any such label to obtain labeled variants of the nucleic acid molecules of the invention.

rDNA Molecules Containing a Nucleic Acid Molecule

The present invention further provides recombinant DNA molecules (rDNAs) that comprise any one of SEQ ID NOS: 1-11,109. As used herein, a rDNA molecule is a DNA molecule that has been subjected to molecular manipulation in situ. Methods for generating rDNA molecules are well known in the art, for example, see Sambrook et al., Molecular Cloning: A Laboratory Manual, 3d Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 2001. In the preferred rDNA molecules, a DNA sequence is operably linked to replication or expression control sequences and/or vector sequences.

The choice of control sequences to which one of the sequences of the present invention is operably linked depends directly, as is well known in the art, on the functional properties desired, e.g., protein expression, replication requirements and the host cell to be transformed. A vector contemplated by the present invention is at least capable of directing the replication or insertion into the host chromosome, and, in certain cases, expression, of the structural gene included in the rDNA molecule.

Expression control elements that are used for regulating the expression of an operably linked protein encoding sequence are known in the art and include, but are not limited to, inducible promoters, constitutive promoters, secretion signals, and other regulatory elements. Preferably, the inducible promoter is readily controlled, such as being responsive to a nutrient in the host cell's medium.

In one embodiment, the vector containing a coding nucleic acid molecule will include a prokaryotic replicon, i.e., a DNA sequence having the ability to direct autonomous replication and maintenance of the recombinant DNA molecule extrachromosomally in a prokaryotic host cell, such as a bacterial host cell, transformed therewith. Such replicons are well known in the art. In addition, vectors that include a prokaryotic replicon may also include a gene whose expression confers a detectable marker such as a drug resistance. Typical bacterial drug resistance genes are those that confer resistance to ampicillin or tetracycline.

Vectors that include a prokaryotic replicon can further include a prokaryotic or bacteriophage promoter capable of directing the expression (transcription and translation) of the coding gene sequences in a bacterial host cell, such as E. coli. A promoter is an expression control element formed by a DNA sequence that permits binding of RNA polymerase and transcription to occur. Promoter sequences compatible with bacterial hosts are typically provided in plasmid vectors containing convenient restriction sites for insertion of a DNA segment of the present invention. Typical of such vector plasmids are pUC8, pUC9, pBR322 and pBR329 available from BioRad Laboratories (Richmond, Calif.), and pPL and pKK223 available from Pharmacia (Piscataway, N.J.).

Expression vectors compatible with eukaryotic cells, preferably those compatible with vertebrate cells, such as canine cells, can also be used to form rDNA molecules that contain a coding sequence. Eukaryotic cell expression vectors, including viral vectors, are well known in the art and are available from several commercial sources. Typically, such vectors are provided containing convenient restriction sites for insertion of the desired DNA segment. Typical of such vectors are pSVL and pKSV-10 (Pharmacia), pBPV-1/pML2d (International Biotechnologies, Inc.), pTDT1 (ATCC, #31255), the vector pCDM8 described herein, and the like eukaryotic expression vectors. Vectors may be modified to include prostate cell specific promoters if needed.

Eukaryotic cell vectors used to construct the rDNA molecules of the present invention may further include a selectable marker that is effective in a eukaryotic cell, preferably a drug resistance selection marker. A preferred drug resistance marker is the gene whose expression results in neomycin resistance, i.e., the neomycin phosphotransferase (neo) gene (Southern et al., J Mol Appl Genet 1:327-341 (1982)). Alternatively, the selectable marker can be present on a separate plasmid, and the two vectors are introduced by co-transfection of the host cell, and selected by culturing in the appropriate drug for the selectable marker.

Host Cells Containing an Exogenously Supplied Coding Nucleic Acid Molecule

The present invention further provides host cells transformed with a nucleic acid molecule of the present invention. The host cell can be either prokaryotic or eukaryotic. Eukaryotic cells useful for expression of proteins are not limited, so long as the cell line is compatible with cell culture methods and compatible with the propagation of the expression vector and possible expression of the gene product. Preferred eukaryotic host cells include, but are not limited to, yeast, insect and mammalian cells, preferably vertebrate cells such as those from a mouse, rat, monkey, human or canine cell line. Preferred eukaryotic host cells include Chinese hamster ovary (CHO) cells available from the ATCC as CCL61, NIH Swiss mouse embryo cells (NIH/3T3) available from the ATCC as CRL 1658, baby hamster kidney cells (BHK), and the like eukaryotic tissue culture cell lines.

Any prokaryotic host can be used to replicate a rDNA molecule of the invention. The preferred prokaryotic host is E. coli.

Transformation of appropriate cell hosts with a rDNA molecule of the present invention is accomplished by well known methods that typically depend on the type of vector used and host system employed. With regard to transformation of prokaryotic host cells, electroporation and salt treatment methods are typically employed, see, for example, Cohen et al., Proc Natl Acad Sci USA 69:2110 (1972); and Sambrook et al. (supra). With regard to transformation of vertebrate cells with vectors containing rDNAs, electroporation, cationic lipid or salt treatment methods are typically employed, see, for example, Graham et al., Virol 52:456 (1973); and Wigler et al., Proc Natl Acad Sci USA 76:1373-1376 (1979).

Successfully transformed cells, i.e., cells that contain a rDNA molecule of the present invention, can be identified by well known techniques including the selection for a selectable marker. For example, cells resulting from the introduction of an rDNA of the present invention can be cloned to produce single colonies. Cells from those colonies can be harvested, lysed and their DNA content examined for the presence of the rDNA using a method such as that described by Southern, J Mol Biol 98:503 (1975) or Berent et al., Biotech 3:208 (1985), or the proteins produced from the cell assayed via an immunological method.

Nucleic Acid Assay Formats

The genes and sequences described herein may be used in a variety of nucleic acid detection assays to detect or quantitate the expression level of a gene or multiple genes in a given sample.

Any assay format to detect gene expression may be used. For example, traditional Northern blotting, dot or slot blot, nuclease protection, primer directed amplification, RT-PCR, semi- or quantitative PCR, branched-chain DNA and differential display methods may be used for detecting gene expression levels. Those methods are useful for some embodiments of the invention. In cases where smaller numbers of genes are detected, amplification based assays may be most efficient. Methods and assays of the invention, however, may be most efficiently designed with hybridization-based methods for detecting the expression of a large number of genes.

Any hybridization assay format may be used, including solution-based and solid support-based assay formats. Solid supports containing oligonucleotide probes based on the genes of the invention can be filters, polyvinyl chloride dishes, particles, beads, microparticles or silicon or glass based chips, etc. Such chips, wafers and hybridization methods are widely available, for example, those disclosed by Beattie (WO 95/11755).

Any solid surface to which oligonucleotides can be bound, either directly or indirectly, either covalently or non-covalently, can be used. A preferred solid support is a high density array or DNA chip. Solid supports also include beads, sets of beads, membranes, and other formats using any material, including glass and/or silicon. When beads or sets of beads are the support, one or more than one species of probe or oligonucleotide may be attached to each bead. In one embodiment, each species of probe or oligonucleotide is attached to a different bead, and the set of beads comprises all or a subset of the nucleic acid molecules described herein. High density arrays or DNA chips contain a particular oligonucleotide probe in a predetermined location on the array. Each predetermined location may contain more than one molecule of the probe, but each molecule within the predetermined location has an identical sequence. Such predetermined locations are termed features. There may be, for example, from 2, 10, 100, 1000 to 10,000, 100,000 or 400,000 or more of such features on a single solid support. The solid support, or the area within which the probes are attached, may be on the order of about a square centimeter. For instance, about 10,000, 100,000 or more probes may be attached per square centimeter. Probes may be attached to single or multiple solid support structures, e.g., the probes may be attached to a single chip or to multiple chips to comprise a chip set.

Oligonucleotide probe arrays for expression monitoring can be made and used according to any techniques known in the art (see for example, Lockhart et al., Nat Biotechnol 14:1675-1680 (1996); McGall et al., Proc Natl Acad Sci USA 93:13555-13560 (1996)). Such probe arrays may contain at least one or more oligonucleotides that are complementary to or hybridize to one or more of the genes or their transcripts. For instance, such arrays may contain oligonucleotides that are complementary or hybridize to at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 50, 70, 100, 500, 1000, 2000, 5000, 10,000 or more of the genes described herein. Preferred arrays contain all or nearly all of the genes described herein, for instance, at least about 90%, 95%, 97%, 99% or 99.5% of the sequences herein described. In a preferred embodiment, arrays are constructed that contain oligonucleotides to detect all or nearly all of the genes on a solid support substrate, such as a chip. Such arrays may represent all or nearly all of the entire expressed genome of a dog.

As described above, in addition to the sequences disclosed, sequences such as naturally occurring variant or polymorphic sequences may be used in the methods and compositions of the invention. For instance, expression levels of various allelic or homologous forms of a gene may be assayed. Any and all nucleotide variations that do not alter the functional activity of a gene, including all naturally occurring allelic variants of the genes herein disclosed, may be used in the methods and to make the compositions (e.g., arrays) of the invention.

Probes based on the sequences of the genes described above may be prepared by any commonly available method. Oligonucleotide probes for screening or assaying a tissue or cell sample are preferably of sufficient length to specifically hybridize only to appropriate, complementary genes or transcripts. Typically the oligonucleotide probes will be at least about 10, 12, 14, 16, 18, 20 or 25 nucleotides in length. In some cases, longer probes of at least 30, 40, or 50 nucleotides will be desirable.

As used herein, oligonucleotide sequences that are complementary to one or more of the genes refer to oligonucleotides that are capable of hybridizing under stringent conditions to at least part of the nucleotide sequences of said genes.

“Bind(s) substantially” refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target polynucleotide sequence.

The terms “background” or “background signal intensity” refer to hybridization signals resulting from non-specific binding, or other interactions, between the labeled target nucleic acids and components of the oligonucleotide array (e.g. the oligonucleotide probes, control probes, the array substrate, etc.). Background signals may also be produced by intrinsic fluorescence of the array components themselves. A single background signal can be calculated for the entire array, or a different background signal may be calculated for each target nucleic acid. In a preferred embodiment, background is calculated as the average hybridization signal intensity for the lowest 5% to 10% of the probes in the array, or, where a different background signal is calculated for each target gene, for the lowest 5% to 10% of the probes for each gene. One of skill in the art will appreciate that where the probes to a particular gene hybridize well and thus appear to be specifically binding to a target sequence, they should not be used in a background signal calculation. Alternatively, background may be calculated as the average hybridization signal intensity produced by hybridization to probes that are not complementary to any sequence found in the sample (e.g. probes directed to nucleic acids of the opposite sense or to genes not found in the sample such as bacterial genes where the sample is mammalian nucleic acids). Background can also be calculated as the average signal intensity produced by regions of the array that lack any probes at all.
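The first background calculation described above, the average signal of the lowest 5% to 10% of probes, may be sketched as follows. The function name and the plain list of intensities are assumptions made for illustration; the same function could be applied to the whole array or to the probes for a single gene.

```python
def background_lowest_fraction(intensities, fraction=0.05):
    """Average the lowest `fraction` of probe signal intensities.

    intensities: iterable of hybridization signal intensities
    (hypothetical input; one value per probe).
    fraction: share of probes to average, e.g. 0.05 to 0.10 per the text.
    At least one probe is always included in the average.
    """
    ordered = sorted(intensities)
    if not ordered:
        raise ValueError("no probe intensities supplied")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    count = max(1, int(len(ordered) * fraction))
    lowest = ordered[:count]
    return sum(lowest) / len(lowest)
```

The alternative calculations in the text (non-complementary control probes, or probe-free regions of the array) amount to averaging a different, pre-selected set of intensities.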

The phrase “hybridizing specifically to” refers to the binding, duplexing, or hybridizing of a molecule substantially to or only to a particular nucleotide sequence or sequences under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA.

Assays and methods of the invention may utilize available formats to simultaneously screen at least about 100, about 1000, about 10,000 or about 1,000,000 different nucleic acid hybridizations.

As used herein a “probe” is defined as a nucleic acid, capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, U, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in probes may be joined by a linkage other than a phosphodiester bond, so long as it does not interfere with hybridization. Thus, probes may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages.

The term “perfect match probe” refers to a probe that has a sequence that is perfectly complementary to a particular target sequence. The test probe is typically perfectly complementary to a portion (subsequence) of the target sequence. The perfect match (PM) probe can be a “test probe”, a “normalization control” probe, an expression level control probe and the like. A perfect match control or perfect match probe is, however, distinguished from a “mismatch control” or “mismatch probe.”

The terms “mismatch control” or “mismatch probe” refer to a probe whose sequence is deliberately selected not to be perfectly complementary to a particular target sequence. For each mismatch (MM) control in a high-density array there typically exists a corresponding perfect match (PM) probe that is perfectly complementary to the same particular target sequence. The mismatch may comprise one or more bases.

While the mismatch(s) may be located anywhere in the mismatch probe, terminal mismatches are less desirable as a terminal mismatch is less likely to prevent hybridization of the target sequence. In a particularly preferred embodiment, the mismatch is located at or near the center of the probe such that the mismatch is most likely to destabilize the duplex with the target sequence under the test hybridization conditions.
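A single central mismatch of the kind just described could be generated from a perfect match probe as sketched below. The function name and the choice of substitution (replacing the central base with its complement, A with T and G with C) are assumptions for illustration; the text requires only that the base differ from the perfect match.

```python
# Illustrative substitution table (an assumption): swap each base for its complement.
_SUBSTITUTE = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_probe(perfect_match):
    """Return a copy of a DNA probe with its central base substituted.

    perfect_match: probe sequence over the alphabet A, C, G, T.
    For an even-length probe the base just right of center is changed.
    """
    probe = perfect_match.upper()
    if not probe:
        raise ValueError("empty probe sequence")
    center = len(probe) // 2
    if probe[center] not in _SUBSTITUTE:
        raise ValueError("central base must be A, C, G or T")
    return probe[:center] + _SUBSTITUTE[probe[center]] + probe[center + 1:]
```

For a 25-nucleotide probe, this changes the thirteenth base and leaves the twelve bases on either side intact.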

The term “stringent conditions” refers to conditions under which a probe will hybridize to its target subsequence, but with only insubstantial hybridization to other sequences or to other sequences such that the difference may be identified. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
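As a rough worked example of selecting a temperature about 5° C. below Tm, the sketch below estimates Tm for a short oligonucleotide with the commonly cited Wallace rule (2° C. per A or T, 4° C. per G or C). The specification does not state which Tm formula is used, so the rule, the function names and the fixed 5° C. offset are assumptions; the rule is only a coarse approximation for short probes and ignores ionic strength, pH and formamide.

```python
def wallace_tm(sequence):
    """Estimate Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C).

    The choice of this rule is an assumption for illustration; it is a
    coarse estimate intended for short oligonucleotides only.
    """
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq) or not seq:
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return 2 * at + 4 * gc


def stringent_temperature(sequence, offset=5):
    """Temperature about `offset` degrees C below the estimated Tm."""
    return wallace_tm(sequence) - offset
```

For a 14-nucleotide probe with seven A/T and seven G/C bases, the estimate is 2×7 + 4×7 = 42° C., giving a hybridization temperature of about 37° C.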

Typically, stringent conditions will be those in which the salt concentration is at least about 0.01 to 1.0 M Na⁺ ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.

The “percentage of sequence identity” or “sequence identity” is determined by comparing two optimally aligned sequences or subsequences over a comparison window or span, wherein the portion of the polynucleotide sequence in the comparison window may optionally comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical subunit (e.g., nucleic acid base or amino acid residue) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Percentage sequence identity when calculated using the programs GAP or BESTFIT (see below) is calculated using default gap weights. In an embodiment of the invention, the percent sequence identity is at least about 90% across 90% of the entire length of a given sequence.
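The calculation described above can be illustrated with a minimal sketch. It assumes the two sequences have already been optimally aligned by some external program and that gaps are written as "-"; the function name and the choice to count every column of the window as a position are illustrative assumptions, not the behavior of GAP or BESTFIT.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    Gaps are written as '-'. Every column of the window counts as a position;
    a column matches only if both sequences carry the same non-gap subunit.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    # matched positions / total positions in the window, times 100
    return 100.0 * matched / len(aligned_a)
```

For instance, two aligned 10-mers that differ at a single position give 9 matched positions out of 10.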

Probe Design

One of skill in the art will appreciate that an enormous number of array designs are suitable for the practice of this invention. The high density array will typically include a number of test probes that specifically hybridize to the sequences of interest. Probes may be produced from any region of the genes identified herein and the attached representative sequence listing. See WO 99/32660 for methods of producing probes for a given gene or genes. In addition, any available software may be used to produce specific probe sequences, including, for instance, software available from Molecular Biology Insights, Olympus Optical Co. and Premier Biosoft International. In a preferred embodiment, the array will also include one or more control probes.

High density array chips of the invention include “test probes.” Test probes may be oligonucleotides that range from about 5 to about 500, or about 7 to about 50 nucleotides, more preferably from about 10 to about 40 nucleotides and most preferably from about 15 to about 35 nucleotides in length. In other particularly preferred embodiments, the probes are 20 or 25 nucleotides in length. In another preferred embodiment, test probes are double or single strand DNA sequences. DNA sequences are isolated or cloned from natural sources or amplified from natural sources using native nucleic acid as templates. These probes have sequences complementary to particular subsequences of the genes whose expression they are designed to detect. Thus, the test probes are capable of specifically hybridizing to the target nucleic acid they are to detect.

In addition to test probes that bind the target nucleic acid(s) of interest, the high density array can contain a number of control probes. The control probes may fall into three categories referred to herein as 1) normalization controls; 2) expression level controls; and 3) mismatch controls.

Normalization controls are oligonucleotide or other nucleic acid probes that are complementary to labeled reference oligonucleotides or other nucleic acid sequences that are added to the nucleic acid sample to be screened. The signals obtained from the normalization controls after hybridization provide a control for variations in hybridization conditions, label intensity, “reading” efficiency and other factors that may cause the signal of a perfect hybridization to vary between arrays. In a preferred embodiment, signals (e.g., fluorescence intensity) read from all other probes in the array are divided by the signal (e.g., fluorescence intensity) from the control probes thereby normalizing the measurements.
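The division step described above can be sketched as follows. This is only an illustration: the function and probe names are hypothetical, and averaging several normalization control signals into a single reference value is an assumption, since the text states only that probe signals are divided by the control signal.

```python
from statistics import mean


def normalize_signals(
    probe_signals: dict[str, float], control_signals: list[float]
) -> dict[str, float]:
    """Divide each probe signal by the normalization control signal.

    Assumption: when several control probes are present, their mean
    intensity is used as the single reference value.
    """
    if not control_signals:
        raise ValueError("at least one normalization control signal is required")
    reference = mean(control_signals)
    if reference <= 0:
        raise ValueError("control signal must be positive")
    return {probe: signal / reference for probe, signal in probe_signals.items()}
```

Applied to two arrays hybridized under slightly different conditions, the normalized values can be compared directly even though the raw intensities differ.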

Virtually any probe may serve as a normalization control. However, it is recognized that hybridization efficiency varies with base composition and probe length. Preferred normalization probes are selected to reflect the average length of the other probes present in the array; however, they can be selected to cover a range of lengths. The normalization control(s) can also be selected to reflect the (average) base composition of the other probes in the array; however, in a preferred embodiment, only one or a few probes are used and they are selected such that they hybridize well (i.e., no secondary structure) and do not match any target-specific probes.

Expression level controls are probes that hybridize specifically with constitutively expressed genes in the biological sample. Virtually any constitutively expressed gene provides a suitable target for expression level controls. Typically expression level control probes have sequences complementary to subsequences of constitutively expressed “housekeeping genes” including, but not limited to the actin gene, the transferrin receptor gene, the GAPDH gene, and the like.

Mismatch controls may also be provided for the probes to the target genes, for expression level controls or for normalization controls. Mismatch controls are oligonucleotide probes or other nucleic acid probes identical to their corresponding test or control probes except for the presence of one or more mismatched bases. A mismatched base is a base selected so that it is not complementary to the corresponding base in the target sequence to which the probe would otherwise specifically hybridize. One or more mismatches are selected such that under appropriate hybridization conditions (e.g., stringent conditions) the test or control probe would be expected to hybridize with its target sequence, but the mismatch probe would not hybridize (or would hybridize to a significantly lesser extent). Preferred mismatch probes contain a central mismatch. Thus, for example, where a probe is a 20 mer, a corresponding mismatch probe will have the identical sequence except for a single base mismatch (e.g., substituting a G, a C or a T for an A) at any of positions 6 through 14 (the central mismatch).
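As an illustration of the 20 mer example above, the following sketch derives a mismatch probe from a perfect match probe by changing one base at a central position. The function name, the default position of 10 and the rule of swapping in the complementary base (A for T, G for C and vice versa) are assumptions made for the sketch; the text permits any non-identical base at any of positions 6 through 14.

```python
# Assumed substitution rule: replace the base with its complement.
# Any other non-identical base would satisfy the description equally well.
SWAP = {"A": "T", "T": "A", "G": "C", "C": "G"}


def make_mismatch_probe(probe: str, position: int = 10) -> str:
    """Return a copy of `probe` with a single base changed at `position`.

    `position` is 1-based. For a 20 mer it must fall within positions
    6 through 14, the central region named in the text.
    """
    probe = probe.upper()
    if not 1 <= position <= len(probe):
        raise ValueError("position is outside the probe")
    if len(probe) == 20 and not 6 <= position <= 14:
        raise ValueError("for a 20 mer the central mismatch lies at positions 6-14")
    base = probe[position - 1]
    if base not in SWAP:
        raise ValueError(f"unsupported base: {base!r}")
    return probe[: position - 1] + SWAP[base] + probe[position:]
```

The perfect match probe and the derived mismatch probe then differ at exactly one central position and are otherwise identical.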

Mismatch probes thus provide a control for non-specific binding or cross hybridization to a nucleic acid in the sample other than the target to which the probe is directed. For example, if the target is present the perfect match probes should be consistently brighter than the mismatch probes. In addition, if all central mismatches are present, the mismatch probes can be used to detect a mutation, for instance, a mutation of a gene comprising one of SEQ ID NOS: 1-11,109. The difference in intensity between the perfect match and the mismatch probe provides a good measure of the concentration of the hybridized material.
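A minimal sketch of how the perfect match and mismatch intensities might be compared is given below. The function names are hypothetical, and summarizing a probe set by the mean PM minus MM difference and by the fraction of pairs in which PM is brighter are assumptions chosen for illustration; the text states only that the intensity difference is a measure of the hybridized material.

```python
from statistics import mean


def average_difference(pm_signals: list[float], mm_signals: list[float]) -> float:
    """Mean of (PM - MM) over the probe pairs of one gene."""
    if len(pm_signals) != len(mm_signals) or not pm_signals:
        raise ValueError("need the same non-zero number of PM and MM signals")
    return mean([pm - mm for pm, mm in zip(pm_signals, mm_signals)])


def fraction_pm_brighter(pm_signals: list[float], mm_signals: list[float]) -> float:
    """Fraction of probe pairs in which the PM probe is brighter than its MM probe."""
    if len(pm_signals) != len(mm_signals) or not pm_signals:
        raise ValueError("need the same non-zero number of PM and MM signals")
    brighter = sum(1 for pm, mm in zip(pm_signals, mm_signals) if pm > mm)
    return brighter / len(pm_signals)
```

A fraction near 1 is consistent with the target being present, while a fraction near 0.5 suggests the signal is largely non-specific.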

Nucleic Acid Samples

Any canine cell or tissue sample may be used in the methods and assays of the invention. Cell or tissue samples used in the assays of the invention may be produced, grown, cultured, etc. in vitro or in vivo. When cultured cells or tissues are used, appropriate mammalian liver extracts may also be added with a test agent to evaluate agents that may require biotransformation to exhibit toxicity. In a preferred format, primary isolates of animal or canine hepatocytes which already express the appropriate complement of drug-metabolizing enzymes may be exposed to the test agent without the addition of mammalian liver extracts.

The genes which are assayed according to the present invention are typically in the form of mRNA or reverse transcribed mRNA. The genes may be cloned or not. The genes may be amplified or not. The cloning and/or amplification do not appear to bias the representation of genes within a population. In some assays, it may be preferable, however, to use polyA+ RNA as a source, as it can be used with fewer processing steps.

As is apparent to one of ordinary skill in the art, nucleic acid samples used in the methods and assays of the invention may be prepared by any available method or process. Methods of isolating total mRNA are well known to those of skill in the art. For example, methods of isolation and purification of nucleic acids are described in detail in Chapter 3 of Laboratory Techniques in Biochemistry and Molecular Biology, Vol. 24, Hybridization With Nucleic Acid Probes: Theory and Nucleic Acid Probes, P. Tijssen, Ed., Elsevier Press, New York, 1993. Such samples include RNA samples, but also include cDNA synthesized from a mRNA sample isolated from a cell or tissue of interest. Such samples also include DNA amplified from the cDNA, and RNA transcribed from the amplified DNA. One of skill in the art would appreciate that it is desirable to inhibit or destroy RNase present in homogenates before homogenates are used.

Biological samples may be of any biological tissue or fluid or cells, as well as cells raised in vitro, such as cell lines and tissue culture cells. Frequently the sample will be a tissue or cell sample that has been exposed to a compound, agent, drug, pharmaceutical composition, potential environmental pollutant or other composition. In some formats, the sample will be a “clinical sample.” Typical clinical samples include, but are not limited to, blood, blood cells (e.g., white cells), tissue or fine needle biopsy samples, urine, peritoneal fluid, and pleural fluid, or cells therefrom.

Biological samples may also include sections of tissues, such as frozen sections or formalin fixed sections taken for histological purposes.

Forming High Density Arrays

Methods of forming high density arrays of oligonucleotides with a minimal number of synthetic steps are known. The oligonucleotide analogue array can be synthesized on a single or on multiple solid substrates by a variety of methods, including, but not limited to, light-directed chemical coupling, and mechanically directed coupling (see Pirrung, U.S. Pat. No. 5,143,854).

In brief, the light-directed combinatorial synthesis of oligonucleotide arrays on a glass surface proceeds using automated phosphoramidite chemistry and chip masking techniques. In one specific implementation, a glass surface is derivatized with a silane reagent containing a functional group, e.g., a hydroxyl or amine group blocked by a photolabile protecting group. Photolysis through a photolithographic mask is used selectively to expose functional groups which are then ready to react with incoming 5′ photoprotected nucleoside phosphoramidites. The phosphoramidites react only with those sites which are illuminated (and thus exposed by removal of the photolabile blocking group). Thus, the phosphoramidites only add to those areas selectively exposed from the preceding step. These steps are repeated until the desired array of sequences has been synthesized on the solid surface. Combinatorial synthesis of different oligonucleotide analogues at different locations on the array is determined by the pattern of illumination during synthesis and the order of addition of coupling reagents.

In addition to the foregoing, additional methods which can be used to generate an array of oligonucleotides on a single substrate are described in PCT Publication Nos. WO 93/09668 and WO 01/23614. High density nucleic acid arrays can also be fabricated by depositing pre-made or natural nucleic acids in predetermined positions. Synthesized or natural nucleic acids are deposited on specific locations of a substrate by light directed targeting and oligonucleotide directed targeting. Another embodiment uses a dispenser that moves from region to region to deposit nucleic acids in specific spots.

Hybridization

Nucleic acid hybridization simply involves contacting a probe and target nucleic acid under conditions where the probe and its complementary target can form stable hybrid duplexes through complementary base pairing. See WO 99/32660. The nucleic acids that do not form hybrid duplexes are then washed away leaving the hybridized nucleic acids to be detected, typically through detection of an attached detectable label. It is generally recognized that nucleic acids are denatured by increasing the temperature or decreasing the salt concentration of the buffer containing the nucleic acids. Under low stringency conditions (e.g., low temperature and/or high salt) hybrid duplexes (e.g., DNA:DNA, RNA:RNA, or RNA:DNA) will form even where the annealed sequences are not perfectly complementary. Thus, specificity of hybridization is reduced at lower stringency. Conversely, at higher stringency (e.g., higher temperature or lower salt) successful hybridization tolerates fewer mismatches. One of skill in the art will appreciate that hybridization conditions may be selected to provide any degree of stringency.

In a preferred embodiment, hybridization is performed at low stringency, in this case in 6× SSPET at 37° C. (0.005% Triton X-100), to ensure hybridization and then subsequent washes are performed at higher stringency (e.g., 1× SSPET at 37° C.) to eliminate mismatched hybrid duplexes. Successive washes may be performed at increasingly higher stringency (e.g., down to as low as 0.25× SSPET at 37° C. to 50° C.) until a desired level of hybridization specificity is obtained. Stringency can also be increased by addition of agents such as formamide. Hybridization specificity may be evaluated by comparison of hybridization to the test probes with hybridization to the various controls that can be present (e.g., expression level control, normalization control, mismatch controls, etc.).

In general, there is a tradeoff between hybridization specificity (stringency) and signal intensity. Thus, in a preferred embodiment, the wash is performed at the highest stringency that produces consistent results and that provides a signal intensity greater than approximately 10% of the background intensity. Thus, in a preferred embodiment, the hybridized array may be washed at successively higher stringency solutions and read between each wash. Analysis of the data sets thus produced will reveal a wash stringency above which the hybridization pattern is not appreciably altered and which provides adequate signal for the particular oligonucleotide probes of interest.

Signal Detection

The hybridized nucleic acids are typically detected by detecting one or more labels attached to the sample nucleic acids. The labels may be incorporated by any of a number of means well known to those of skill in the art. See WO 99/32660.

Databases

The present invention includes relational databases containing sequence information, for instance, for the genes herein described, as well as gene expression information from tissue or cells, such as canine cells or tissue exposed to various standard compounds, such as toxins. Databases may also contain information associated with a given sequence or tissue sample such as descriptive information about the gene associated with the sequence information, or descriptive information concerning the clinical status of the tissue sample, or the animal from which the sample was derived. The database may be designed to include different parts, for instance a sequence database and a gene expression database. Methods for the configuration and construction of such databases and computer-readable media to which such databases are saved are widely available, for instance, see U.S. Pat. No. 5,953,727, which is herein incorporated by reference in its entirety.

The databases of the invention may be linked to an outside or external database such as GenBank (www.ncbi.nlm.nih.gov/entrez.index.html); KEGG (www.genome.ad.jp/kegg); SPAD (www.grt.kyzushu-u.ac.jp/spad/index.html); HUGO (www.gene.ucl.ac.uk/hugo); Swiss-Prot (www.expasy.ch.sprot); Prosite (www.expasy.ch/tools/scnpsit1.html); OMIM (www.ncbi.nlm.nih.gov/omin); and GDB (www.gdb.org). In a preferred embodiment, the external database is GenBank and the associated databases maintained by the National Center for Biotechnology Information (NCBI) (www.ncbi.nlm.nih.gov).

Any appropriate computer platform, user interface, etc. may be used to perform the necessary comparisons between sequence information, gene expression information and any other information in the database or information provided as an input. For example, a large number of computer workstations are available from a variety of manufacturers, such as those available from Silicon Graphics. Client/server environments, database servers and networks are also widely available and appropriate platforms for the databases of the invention.

The databases of the invention may be used to produce, among other things, electronic Northerns that allow the user to determine the cell type or tissue in which a given gene is expressed and to allow determination of the abundance or expression level of a given gene in a particular tissue or cell.
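By way of illustration only, a relational layout with a sequence part and a gene expression part, together with an electronic Northern style query, might be sketched as follows. The table names, column names and the use of an in-memory SQLite database are all hypothetical choices for the sketch and are not drawn from the databases of the invention.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE sequence (
    seq_id      INTEGER PRIMARY KEY,
    description TEXT
);
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    tissue    TEXT NOT NULL,
    compound  TEXT
);
CREATE TABLE expression (
    seq_id    INTEGER NOT NULL REFERENCES sequence(seq_id),
    sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
    level     REAL NOT NULL,
    PRIMARY KEY (seq_id, sample_id)
);
"""


def build_database() -> sqlite3.Connection:
    """Create an empty in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def electronic_northern(conn: sqlite3.Connection, seq_id: int) -> dict[str, float]:
    """Return the mean expression level of one gene in each tissue."""
    rows = conn.execute(
        """
        SELECT s.tissue, AVG(e.level)
        FROM expression AS e
        JOIN sample AS s ON s.sample_id = e.sample_id
        WHERE e.seq_id = ?
        GROUP BY s.tissue
        """,
        (seq_id,),
    ).fetchall()
    return {tissue: level for tissue, level in rows}
```

Keeping samples and expression values in separate tables lets descriptive information about a tissue sample be recorded once and joined to any number of measurements.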

The databases of the invention may also be used to present information identifying the expression level in a tissue or cell of a set of genes comprising one or more of the genes of SEQ ID NOS: 1-11,109, comprising the step of comparing the expression level of at least one gene in a cell or tissue exposed to a test agent to the level of expression of the gene in the database. Such methods may be used to predict the toxic potential of a given compound by comparing the level of expression of a gene or genes from a tissue or cell sample exposed to the test agent to the expression levels found in control tissue or cell samples exposed to a standard toxin or hepatotoxin such as those herein described. Such methods may also be used in the drug or agent screening assays as described herein.

Kits

The invention further includes kits combining, in different combinations, high-density oligonucleotide arrays, reagents for use with the arrays, protein reagents encoded by the genes herein described, signal detection and array-processing instruments, gene expression databases and analysis and database management software described above. The kits may be used, for example, to predict or model the toxic response of a test compound, to monitor the progression of disease states, to identify genes that show promise as new drug targets and to screen known and newly designed drugs as discussed above.

The databases packaged with the kits may be a compilation of expression patterns of the genes in various tissues or in tissues, including cell or tissue samples, exposed to various compounds or reference toxins. In particular, the database software and packaged information that may contain the databases saved to a computer-readable medium include the expression results of the genes that can be used to predict toxicity of a test agent, by comparing the expression levels of the genes induced by the test agent to the expression levels in control samples. In another format, database and software information may be provided in a remote electronic format, such as a website, the address of which may be packaged in the kit.

The kits may be used in the pharmaceutical industry, where the need for early drug testing is strong due to the high costs associated with drug development, but where bioinformatics, in particular gene expression informatics, is still lacking. These kits will reduce the costs, time and risks associated with traditional new drug screening using cell cultures and laboratory animals. The results of large-scale drug screening of pre-grouped patient populations, pharmacogenomics testing, can also be applied to select drugs with greater efficacy and fewer side-effects. The kits may also be used by smaller biotechnology companies and research institutes who do not have the facilities for performing such large-scale testing themselves.

Databases and software designed for use with microarrays are discussed in Balaban et al., U.S. Pat. No. 6,229,911, a computer-implemented method for managing information, stored as indexed tables, collected from small or large numbers of microarrays, and U.S. Pat. No. 6,185,561, a computer-based method with data mining capability for collecting gene expression level data, adding additional attributes and reformatting the data to produce answers to various queries. Chee et al., U.S. Pat. No. 5,974,164, disclose a software-based method for identifying mutations in a nucleic acid sequence based on differences in probe fluorescence intensities between wild type and mutant sequences that hybridize to reference sequences.

Identification of Marker Genes

Cell or tissue samples such as those associated with a disease state, for example, may be analyzed using the microarray chip of the invention, and gene expression profiles may be prepared. Expression levels of genes identified as marker genes, based on their properties as an indicator of a disease state, or as an indicator of normal functioning, for example, may be measured and then used to monitor a variety of medical treatments or in diagnostic procedures. Marker genes may be used in pharmaceutical development to monitor the degree of apoptosis or effect of treatment with pharmaceuticals, such as beta-adrenergic blocking agents. Additionally, the expression level of genes involved in the development of carcinomas or autoimmune disorders may be measured. In gene therapy, monitoring the expression of marker genes provides an indication of the level of genes delivered by various viral and synthetic non-viral vectors.

Identification of Toxicity Markers

To evaluate and identify gene expression changes that are predictive of toxicity, studies using selected compounds with well characterized toxicity can be used to catalogue altered gene expression during exposure in vivo and in vitro. For instance, canine cell or tissue samples can be prepared by administering a toxin or a control to a canine subject and harvesting tissue or cell samples after exposure. In another embodiment, in vitro cultured canine cells are exposed to the toxin. Methods of exposure or administration and methods of preparing cell or tissue samples are well known in the art. See, for example, PCT publication nos. WO 02/10453 and WO 02/095000, as well as PCT application nos. PCT/US02/21735, filed Jul. 10, 2002, and PCT/US03/03194, filed Jan. 31, 2003, all of which are herein incorporated by reference. In the instant invention, standard known toxins such as acyclovir, amitriptyline, alpha-naphthylisothiocyanate (ANIT), acetaminophen, AY-25329, bicalutamide, carbon tetrachloride, chloroform, clofibrate, cyproterone acetate (CPA), diclofenac, diflunisal, dioxin, 17α-ethinylestradiol, hydrazine, indomethacin, lipopolysaccharide, phenobarbital, tacrine, valproate, WY-14643, zileuton, methotrexate, lovastatin, mercuric chloride, cephaloridine, ifosfamide, cyclophosphamide and minoxidil, 2-acetylaminofluorene (2-AAF), amiodarone, BI liver toxin, carbamazepine, chlorpromazine, CI-1000, colchicine, dimethylnitrosamine (DMN), gemfibrozil, imipramine, menadione, tamoxifen, tetracycline, thioacetamide, adriamycin, bromoethylamine HBr, carboplatin, cidofovir, cis-platin, citrinin, cyclophosphamide, gentamicin, hydralazine, lithium, pamidronate, puromycin aminonucleoside, sulfadiazine, sodium chromate, sodium oxalate, vancomycin, BI-QT, clenbuterol, isoproterenol, norepinephrine, epinephrine, amphotericin B, epirubicin, phenylpropanolamine, rosiglitazone and 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine HCl (MPTP) may be used to produce toxin-specific and composite gene expression profiles.

Toxicity Prediction and Modeling

The genes and gene expression information, as well as the portfolios and subsets of the genes that may be identified using the sequences and arrays of the invention, may be used to predict at least one toxic effect, such as the hepatotoxicity or nephrotoxicity of a test or unknown compound. As used herein, at least one toxic effect includes, but is not limited to, a detrimental change in the physiological status of a cell or organism. The response may be, but is not required to be, associated with a particular pathology, such as tissue necrosis. The response may be associated with all or only part of an organ, e.g., renal tubular necrosis or glomerulonephritis. Additionally, the toxic effect includes effects at the molecular and cellular level. Hepatotoxicity is an effect as used herein and includes but is not limited to the pathologies of liver necrosis, hepatitis, fatty liver and protein adduct formation.

In general, assays to predict the toxicity of a test agent (or compound or multi-component composition) comprise the steps of exposing a cell population to the test compound, assaying or measuring the level of relative or absolute gene expression of one or more of the genes as herein described and comparing the identified expression level(s) to the expression level(s) found for a standard toxin. Assays may include the measurement of the expression levels of about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 50, 75, 100, 500, 1000, 5000, 10,000 or more genes.

In the methods of the invention, the gene expression level for a gene or genes induced by the test agent, compound or compositions may be comparable to the levels found in the databases disclosed herein, or in other samples, such as toxin-exposed samples, if the expression level varies within a factor of about 2, about 1.5 or about 1.0 fold. In some cases, the expression levels are comparable if the agent induces a change in the expression of a gene in the same direction (e.g., up or down) as a reference toxin.
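The two comparability criteria above can be sketched as simple tests. The function names are hypothetical, and reading "within a factor of about 2" as a symmetric ratio test (the larger level divided by the smaller is at most the stated factor) is an interpretive assumption rather than a definition taken from the text.

```python
def is_comparable(
    test_level: float, reference_level: float, max_fold: float = 2.0
) -> bool:
    """True if two positive expression levels differ by at most `max_fold`.

    Assumption: the fold difference is taken symmetrically, i.e. as the
    larger level divided by the smaller one.
    """
    if test_level <= 0 or reference_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = test_level / reference_level
    return max(ratio, 1.0 / ratio) <= max_fold


def same_direction(test_change: float, reference_change: float) -> bool:
    """True if two signed changes (e.g. log ratios versus control) are both
    up or both down."""
    return test_change * reference_change > 0
```

Under these assumptions a test level of 150 against a reference of 100 is comparable at a factor of 2, while a level of 300 is not.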

The cell population that is exposed to the test agent, compound or composition may be exposed in vitro or in vivo. For instance, cultured or freshly isolated hepatocytes, in particular dog hepatocytes, may be exposed to the agent under standard laboratory and cell culture conditions. In another assay format, in vivo exposure may be accomplished by administration of the agent to a living animal, for instance a laboratory dog.

Procedures for designing and conducting toxicity tests in in vitro and in vivo systems are well known, and are described in many texts on the subject, such as Loomis et al., Loomis's Essentials of Toxicology, 4th Ed., Academic Press, New York, 1996; Ecobichon, The Basis of Toxicity Testing, CRC Press, Boca Raton, 1992; Frazier, editor, In Vitro Toxicity Testing, Marcel Dekker, New York, 1992; and the like.

In in vivo toxicity testing, two groups of test organisms are usually employed: one group serves as a control and the other group receives the test compound in a single dose (for acute toxicity tests) or a regimen of doses (for prolonged or chronic toxicity tests). Because, in some cases, the extraction of tissue as called for in the methods of the invention requires sacrificing the test animal, both the control group and the group receiving compound must be large enough to permit removal of animals for sampling tissues, if it is desired to observe the dynamics of gene expression through the duration of an experiment.

In setting up a toxicity study, extensive guidance is provided in the literature for selecting the appropriate test organism for the compound being tested, route of administration, dose ranges, and the like. Water or physiological saline (0.9% NaCl in water) is the solvent of choice for the test compound since these solvents permit administration by a variety of routes. When this is not possible because of solubility limitations, vegetable oils such as corn oil or organic solvents such as propylene glycol may be used.

Regardless of the route of administration, the volume required to administer a given dose is limited by the size of the animal that is used. It is desirable to keep the volume of each dose uniform within and between groups of animals. Even when aqueous or physiological saline solutions are used for parenteral injection, the volumes that are tolerated are limited, although such solutions are ordinarily thought of as being innocuous. In some instances, the route of administration to the test animal should be the same as, or as similar as possible to, the route of administration of the compound to man for therapeutic purposes.

When a compound is to be administered by inhalation, special techniques for generating test atmospheres are necessary. The methods usually involve aerosolization or nebulization of fluids containing the compound. If the agent to be tested is a fluid that has an appreciable vapor pressure, it may be administered by passing air through the solution under controlled temperature conditions. Under these conditions, dose is estimated from the volume of air inhaled per unit time, the temperature of the solution, and the vapor pressure of the agent involved. Gases are metered from reservoirs. When particles of a solution are to be administered, unless the particle size is less than about 2 μm the particles will not reach the terminal alveolar sacs in the lungs. A variety of apparatuses and chambers are available to perform studies for detecting the irritant or other toxic effects of agents administered by inhalation. The preferred method of administering an agent to animals is via the oral route, either by intubation or by incorporating the agent in the feed.

When the agent is exposed to cells in vitro or in cell culture, the cell population to be exposed to the agent may be divided into two or more subpopulations, for instance, by dividing the population into two or more identical aliquots. In some preferred embodiments of the methods of the invention, the cells to be exposed to the agent are derived from liver tissue. For instance, cultured or freshly isolated rat hepatocytes may be used.

The methods of the invention may be used to generally predict at least one toxic response, and as described in the Examples, may be used to predict the likelihood that a compound or test agent will induce various specific pathologies such as those of the liver (liver necrosis, fatty liver disease, protein adduct formation or hepatitis), those of the kidney, heart, brain or testes, or other pathologies associated with at least one of the toxins herein described. The methods of the invention may also be used to determine the similarity of a toxic response to one or more individual compounds. In addition, the methods of the invention may be used to predict or elucidate the potential cellular pathways influenced, induced or modulated by the compound or test agent due to the similarity of the expression profile compared to the profile induced by a known toxin.

Diagnostic Uses for the Toxicity Markers

As described above, the genes and gene expression information or portfolios of the genes with their expression information may be used as diagnostic markers for the prediction or identification of the physiological state of a tissue or cell sample that has been exposed to a compound or to identify or predict the toxic effects of a compound or agent. For instance, a tissue sample such as a sample of peripheral blood cells or some other easily obtainable tissue sample may be assayed by any of the methods described above, and the expression levels from a gene or genes may be compared to the expression levels found in tissues or cells exposed to the toxins described herein. These methods may result in the diagnosis of a physiological state in the cell or may be used to identify the potential toxicity of a compound, for instance a new or unknown compound or agent. The comparison of expression data, as well as available sequence or other information, may be done by a researcher or diagnostician or may be done with the aid of a computer and databases as described above.

In another format, the levels of a gene or genes, the encoded protein(s), or any metabolite produced by the encoded protein may be monitored or detected in a sample, such as a bodily tissue or fluid sample to identify or diagnose a physiological state of an organism. Such samples may include any tissue or fluid sample, including urine, blood and easily obtainable cells such as peripheral lymphocytes.

Use of the Markers for Monitoring Toxicity Progression

As described above, the genes and gene expression information provided may also be used as markers for the monitoring of toxicity progression, such as that found after initial exposure to a drug, drug candidate, toxin, pollutant, etc. For instance, a tissue or cell sample may be assayed by any of the methods described above, and the expression levels from a gene or genes may be compared to the expression levels found in tissue or cells exposed to a standard toxin or toxins. The comparison of the expression data, as well as available sequence or other information, may be done by a researcher or diagnostician or may be done with the aid of a computer and databases.

Use of the Toxicity Markers for Drug Screening

According to the present invention, the genes and arrays described herein may be used to identify markers or drug targets to evaluate the effects of a candidate drug, chemical compound or other agent on a cell or tissue sample. For instance, the genes may also be used as drug targets to screen for agents that modulate their expression and/or activity. In various formats, a candidate drug or agent can be screened for the ability to stimulate the transcription or expression of a given marker or markers or to down-regulate or counteract the transcription or expression of a marker or markers. According to the present invention, one can also compare the specificity of a drug's effects by looking at the number of markers which the drug induces and comparing them. More specific drugs will have fewer transcriptional targets. Similar sets of markers identified for two drugs may indicate a similarity of effects.

Assays to monitor the expression of a marker or markers may utilize any available means of monitoring for changes in the expression level of the nucleic acids of the invention. As used herein, an agent is said to modulate the expression of a nucleic acid of the invention if it is capable of up- or down-regulating expression of the nucleic acid in a cell.
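The comparison of marker sets described above can be sketched as follows. The marker count stands in for specificity and a Jaccard index stands in for similarity of effects; both choices, and every name and gene identifier shown, are illustrative assumptions rather than a method stated in this disclosure.

```python
def specificity(markers):
    """Number of distinct markers a drug induces; a smaller count suggests a more specific drug."""
    return len(set(markers))


def jaccard(markers_a, markers_b):
    """Overlap of two marker sets: shared markers divided by all markers seen in either set."""
    a, b = set(markers_a), set(markers_b)
    if not a and not b:
        return 0.0  # two empty sets: treat as no evidence of similarity
    return len(a & b) / len(a | b)


# Hypothetical marker sets for two drugs.
drug_1 = {"gene_a", "gene_b", "gene_c"}
drug_2 = {"gene_b", "gene_c", "gene_d", "gene_e", "gene_f"}

print("markers induced:", specificity(drug_1), "vs", specificity(drug_2))
print("marker set overlap:", round(jaccard(drug_1, drug_2), 3))
```

In this invented example the first drug induces fewer markers than the second, and the two sets share two of six distinct markers.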

In one assay format, gene chips containing probes to one, two or more genes as described herein may be used to directly monitor or detect changes in gene expression in the treated or exposed cell. Cell lines, tissues or other samples are first exposed to a test agent and, in some instances, a known toxin, and the detected expression levels of one or more, or preferably 2 or more, of the genes are compared to the expression levels of those same genes in samples exposed to a known toxin alone. Compounds that modulate the expression patterns of the known toxin(s) would be expected to modulate potential toxic physiological effects in vivo.

Agents that are assayed in the above methods can be randomly selected or rationally selected or designed. As used herein, an agent is said to be randomly selected when the agent is chosen randomly without considering the specific sequences involved in the association of a protein of the invention alone or with its associated substrates, binding partners, etc. An example of randomly selected agents is the use of a chemical library or a peptide combinatorial library, or a growth broth of an organism.

As used herein, an agent is said to be rationally selected or designed when the agent is chosen on a nonrandom basis which takes into account the sequence of the target site and/or its conformation in connection with the agent's action. Agents can be rationally selected or rationally designed by utilizing the peptide sequences that make up these sites. For example, a rationally selected peptide agent can be a peptide whose amino acid sequence is identical to or a derivative of any functional consensus site.

The agents of the present invention can be, as examples, peptides, small molecules, vitamin derivatives, as well as carbohydrates. Dominant negative proteins, DNAs encoding these proteins, antibodies to these proteins, peptide fragments of these proteins or mimics of these proteins may be introduced into cells to affect function. “Mimic” used herein refers to the modification of a region or several regions of a peptide molecule to provide a structure chemically different from the parent peptide but topographically and functionally similar to the parent peptide (see G. A. Grant in: Molecular Biology and Biotechnology, Meyers, ed., pp. 659-664, VCH Publishers, New York, 1995). A skilled artisan can readily recognize that there is no limit as to the structural nature of the agents of the present invention.
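A minimal sketch of the comparison in this assay format follows. It divides each gene's expression level in a sample exposed to the toxin plus a test agent by its level in a sample exposed to the toxin alone. The two-fold cutoff, the function names and the data are assumptions chosen only for illustration.

```python
def fold_changes(test_levels, reference_levels):
    """Divide each gene's level in the test sample by its level in the reference sample."""
    ratios = {}
    for gene, level in test_levels.items():
        reference = reference_levels.get(gene)
        if reference is None or reference <= 0:
            continue  # no usable reference level; skip rather than divide by zero
        ratios[gene] = level / reference
    return ratios


def modulated_genes(ratios, threshold=2.0):
    """Genes whose ratio is at least `threshold` (up) or at most 1/threshold (down)."""
    return sorted(g for g, r in ratios.items() if r >= threshold or r <= 1.0 / threshold)


# Hypothetical expression levels (arbitrary units).
toxin_alone = {"gene_a": 100.0, "gene_b": 100.0, "gene_c": 80.0}
toxin_plus_agent = {"gene_a": 50.0, "gene_b": 110.0, "gene_c": 200.0}

ratios = fold_changes(toxin_plus_agent, toxin_alone)
print(modulated_genes(ratios))
```

The threshold is a free parameter of this sketch; in practice a cutoff would be chosen with regard to the variability of the assay.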

Use of Assays and Genes for Veterinary Medicine

The genes and arrays described herein may be used in veterinary medicine, for instance, to produce canine gene expression profiles indicative of a disease or physiological state. For instance, gene expression profiles may be created using arrays of the invention from peripheral blood cells isolated from an animal with a known disease state, for example, an inflammatory disease. Such gene expression profiles can then be used as diagnostic or therapeutic markers to aid in prediction of disease, to monitor treatment progression or efficacy, or to monitor disease progression (see WO 99/10536).

Without further description, it is believed that one of ordinary skill in the art can, using the preceding description and the following illustrative examples, make and utilize the compounds of the present invention and practice the claimed methods. The following working examples, therefore, specifically point out the preferred embodiments of the present invention, and are not to be construed as limiting in any way the remainder of the disclosure.

EXAMPLES

Example 1 Identification of Canine Nucleic Acid Sequences

A cDNA library of mixed canine tissues (liver, kidney, heart, brain and testes) was produced according to standard methods. Following 3′ EST sequencing to identify individually expressed genes and gene fragments, these genes and gene fragments were further sequenced and were analyzed for their homology to known sequences. Only sequences that showed alignment below a first threshold level (90%) to sequences in public databases and that had identity below a second threshold level (90%) within the region of alignment were used to prepare microarrays.
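The two-threshold selection described above can be sketched as a simple filter. The sketch assumes that "alignment" means the fraction of the query sequence covered by its best database hit and that "identity" is measured within that aligned region; the record layout, the names and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BestHit:
    seq_id: str
    aligned_fraction: float  # fraction of the query covered by the alignment (0.0-1.0)
    identity: float          # fraction of identical positions within the aligned region


def is_novel(hit, max_alignment=0.90, max_identity=0.90):
    """Keep a sequence only if it falls below both thresholds (90% each in Example 1)."""
    return hit.aligned_fraction < max_alignment and hit.identity < max_identity


# Hypothetical best-hit summaries for four sequenced clones.
hits = [
    BestHit("clone_001", 0.95, 0.97),  # closely matches a known sequence
    BestHit("clone_002", 0.40, 0.85),  # below both thresholds
    BestHit("clone_003", 0.95, 0.60),  # long alignment, low identity
    BestHit("clone_004", 0.00, 0.00),  # no database hit at all
]

novel_ids = [hit.seq_id for hit in hits if is_novel(hit)]
print(novel_ids)
```

Under this reading, a sequence must fall below both thresholds to be retained, so clone_003 is excluded even though its identity is low.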

Example 2 Preparation of Canine Microarray

Oligonucleotides of approximately 25 bases, corresponding to various regions of the novel genes identified above, were synthesized according to standard methods. The oligonucleotides were spotted onto microchips according to the Affymetrix photolithography protocol to create microarrays with over 100,000 sequences per chip. The chips were tested for intra-lot variability, inter-lot variability and day-to-day variability. The chips were also tested for the specificity of binding to canine RNA in hybridization experiments with RNA samples from various species: dog, human, rat and mouse. As sample preparation for testing the microarrays, total RNA was extracted from the following canine tissues and pooled: liver, kidney, heart, brain and testes. The pooled RNA was reverse transcribed to prepare cDNA and amplified by reaction with a reverse polymerase to prepare cRNA.

RNA samples from dogs hybridized to the chips to a considerably greater degree than samples from other species. The percentages of sequences on the chips that did and did not bind to RNA samples from other species are indicated in the following table.

organism   ave. % present   ave. % absent   ave. % marginal   % genes at or above 0.5 pM spike-in level
human      7.1              91.2            1.8               3.5
rat        4.2              94.6            1.2               4.6
mouse      4.4              94.1            1.5               2.1

As a further control of specific hybridization, bacterial spikes were performed. Oligonucleotides designed from bacterial DNA sequences (Affymetrix) were incorporated into the microarrays, and canine RNA samples were spiked with known quantities of purified bacterial DNA (Affymetrix).

Example 3 Identification of Toxicity Markers and Toxicity Expression Profiles

Laboratory dogs are exposed to toxins, such as gentamicin, according to the following protocol. Gentamicin or vehicle (saline) is administered to dogs as shown below. The toxin is also prepared in saline solution.

Group   Drug                     Dose Level (mg/kg)   No. of Males   Sacrifice
1       Saline vehicle control   -                    5              6 hours after dosing
2       Gentamicin               X*                   5              6 hours after dosing
3       Gentamicin               Y*                   5              6 hours after dosing
4       Saline vehicle control   -                    5              24 hours after dosing
5       Gentamicin               X                    5              24 hours after dosing
6       Gentamicin               Y                    5              24 hours after dosing
7       Saline vehicle control   -                    5              Day 7
8       Gentamicin               X                    5              Day 7
9       Gentamicin               Y                    5              Day 7

*X represents a safe but efficacious dose; Y represents a toxic or maximum-tolerated dose.
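The nine dosing groups above form a crossed design of three sacrifice time points by three treatments. The following sketch regenerates the group list from those two factors; the variable and function names are illustrative assumptions, and the dose levels X and Y are left symbolic as in the table.

```python
from itertools import product

TIME_POINTS = ["6 hours after dosing", "24 hours after dosing", "Day 7"]
TREATMENTS = [("Saline vehicle control", None), ("Gentamicin", "X"), ("Gentamicin", "Y")]
MALES_PER_GROUP = 5


def build_groups():
    """Enumerate groups with time point as the outer factor and treatment as the inner factor."""
    groups = []
    design = product(TIME_POINTS, TREATMENTS)
    for number, (sacrifice, (drug, dose)) in enumerate(design, start=1):
        groups.append({
            "group": number,
            "drug": drug,
            "dose": dose,  # None for vehicle; "X" and "Y" are symbolic dose levels
            "males": MALES_PER_GROUP,
            "sacrifice": sacrifice,
        })
    return groups


groups = build_groups()
for g in groups:
    print(g["group"], g["drug"], g["dose"] or "-", g["males"], g["sacrifice"])
print("total animals:", sum(g["males"] for g in groups))
```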

The toxin is administered daily by intramuscular injection. Animals are not dosed on the day of necropsy, with the exception of the 6-hour time point animals. ˜0.5 mL of blood from each animal is collected into an EDTA tube for analysis of plasma drug levels. Plasma (˜200 μL) is obtained, frozen at ˜−80° C. and used for test compound/metabolite estimation.

Animals are observed twice daily for signs of illness and drug toxicity (e.g., tremors, convulsions, salivation, diarrhea, lethargy, coma or other atypical behavior or appearance). Observations are recorded as they occur and include the time of onset, degree, and duration.

Blood samples are collected from each animal as follows. Approximately 1 mL of blood is collected into an EDTA tube for evaluation of hematology parameters. Approximately 1 mL of blood is collected into serum separator tubes for clinical chemistry analysis. An additional ˜2 mL of blood is collected into a 15 mL conical polypropylene vial to which ˜3 mL of Trizol is immediately added. The contents are mixed immediately with a vortex and by repeated inversion. The tubes are frozen in liquid nitrogen and stored at ˜−80° C.

At sacrifice, approximately 6 and 24 hours and 7 days after dosing, dogs scheduled for sacrifice are weighed, physically examined, and sacrificed by standard procedures using sterile, disposable instruments.

Fresh and sterile disposable instruments are used to collect tissues, with the exception of bone cutters that are used to open the skull cap. These are sterilized between uses. All tissues are collected and frozen within approximately 5 minutes of the animal's death. The liver sections are frozen within approximately 2 minutes of the animal's death. The time of euthanasia, an interim time point at freezing of liver sections, and time at completion of necropsy are recorded. Tissues are stored at approximately −80° C., stored in liquid nitrogen, or preserved in 10% neutral buffered formalin.

Tissue collection is performed as follows. For the liver, the right medial lobe is snap frozen in liquid nitrogen and stored at ˜−80° C. The left medial lobe is preserved in 10% neutral-buffered formalin (NBF), and the left lateral lobe is snap frozen in liquid nitrogen and stored at ˜−80° C.

For the heart, a sagittal cross-section containing portions of the two atria and the two ventricles is preserved in 10% NBF for microscopic examination. The remaining heart is frozen in liquid nitrogen and stored at ˜−80° C.

For the kidneys, each kidney is hemi-dissected. Half is preserved in 10% NBF for microscopic examination, and the remaining half is frozen in liquid nitrogen and stored at ˜−80° C.

For the testes, a sagittal cross-section of each testis is preserved in 10% NBF for microscopic examination. The remaining testes are frozen together in liquid nitrogen and stored at ˜−80° C.

For the brain, a cross-section of the cerebral hemispheres and of the diencephalon is preserved in 10% NBF and the rest of the brain is frozen in liquid nitrogen and stored at ˜−80° C.

Microarray sample preparation is conducted with minor modifications, following the protocols set forth in the Affymetrix GeneChip Expression Analysis Manual. Frozen tissue is ground to a powder using a Spex Certiprep 6800 Freezer Mill. Total RNA is extracted with Trizol (GibcoBRL) utilizing the manufacturer's protocol. mRNA is isolated using the Oligotex mRNA Midi kit (Qiagen) followed by ethanol precipitation. Double stranded cDNA is generated from mRNA using the SuperScript Choice system (GibcoBRL). First strand cDNA synthesis is primed with a T7-(dT24) oligonucleotide. The cDNA is phenol-chloroform extracted and ethanol precipitated to a final concentration of 1 μg/ml. From 2 μg of cDNA, cRNA is synthesized using Ambion's T7 MegaScript in vitro Transcription Kit.

To biotin label the cRNA, nucleotides Bio-11-CTP and Bio-16-UTP (Enzo Diagnostics) are added to the reaction. Following a 37° C. incubation for six hours, impurities are removed from the labeled cRNA following the RNeasy Mini kit protocol (Qiagen). cRNA is fragmented (fragmentation buffer consisting of 200 mM Tris-acetate, pH 8.1, 500 mM KOAc, 150 mM MgOAc) for thirty-five minutes at 94° C. Following the Affymetrix protocol, 55 μg of fragmented cRNA is hybridized on the array chip, or chip set, of the invention for twenty-four hours at 60 rpm in a 45° C. hybridization oven. The chips are washed and stained with Streptavidin Phycoerythrin (SAPE) (Molecular Probes) in Affymetrix fluidics stations. To amplify staining, SAPE solution is added twice, with an anti-streptavidin biotinylated antibody (Vector Laboratories) staining step in between. Hybridization to the probe arrays is detected by fluorometric scanning (Hewlett Packard Gene Array Scanner). Data is analyzed using Affymetrix GeneChip® version 3.0 and Expression Data Mining (EDMT) software (version 1.0), GeneExpress2000, and S-Plus.

Those genes that are differentially expressed upon exposure to gentamicin are identified using the microarray hybridization techniques described above, with data analysis according to a statistical method such as ANOVA, LDA or PCA (see WO 02/10453 or WO 02/095000). The set of genes that are differentially expressed creates an expression profile for a particular toxin. The determination of a particular gene expression profile in a tissue sample from a particular animal indicates a toxic response in that animal.

Although the present invention has been described in detail with reference to examples above, it is understood that various modifications can be made without departing from the spirit of the invention. Accordingly, the invention is limited only by the following claims. All cited patents, patent applications and publications referred to in this application are herein incorporated by reference in their entirety.
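As a hedged illustration of the ANOVA option named above, the sketch below computes a one-way ANOVA F statistic per gene across three treatment groups (vehicle, dose X, dose Y) and ranks genes by that statistic. It is not the analysis of the cited applications; the function name, gene identifiers and expression values are invented, and no p-value is derived because that would require the F distribution.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA: between-group mean square over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical expression values per gene: [vehicle, dose X, dose Y], three animals each.
expression = {
    "gene_a": [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]],
    "gene_b": [[4.0, 5.0, 6.0], [4.0, 5.0, 6.0], [4.0, 5.0, 6.0]],
}

f_scores = {gene: one_way_anova_f(groups) for gene, groups in expression.items()}
ranked = sorted(f_scores, key=f_scores.get, reverse=True)
print(ranked, f_scores)
```

Genes with a large F statistic vary more between treatment groups than within them and are candidates for inclusion in an expression profile.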

1. An isolated nucleic acid molecule comprising any one of SEQ ID NOS: 1-11,109, the complement thereof, or a sequence exhibiting greater than 90% sequence identity across greater than 90% of the length of any one of SEQ ID NOS: 1-11,109.
2. A set of probes, wherein each of the probes comprises a sequence that specifically hybridizes to a gene or the transcript of a gene comprising any one of SEQ ID NOS: 1-11,109.
3. A set of probes according to claim 2, wherein the set comprises probes that specifically hybridize to at least 2 of the genes of SEQ ID NOS: 1-11,109.
4. A set of probes according to claim 2, wherein the set comprises probes that specifically hybridize to at least about 5 of the genes of SEQ ID NOS: 1-11,109.
5. A set of probes according to claim 2, wherein the set comprises probes that specifically hybridize to at least about 10 of the genes of SEQ ID NOS: 1-11,109.
6. A set of probes according to claim 2, wherein the set comprises probes that specifically hybridize to at least about 100 of the genes of SEQ ID NOS: 1-11,109.
7. A set of probes according to claim 2, wherein the set comprises probes that specifically hybridize to at least about 1000 of the genes of SEQ ID NOS: 1-11,109.
8. A set of probes according to claim 2, wherein the set comprises probes that specifically hybridize to about 99% of the genes of SEQ ID NOS: 1-11,109.
9. A set of probes according to claim 2, wherein the set comprises probes that specifically hybridize to all of the genes of SEQ ID NOS: 1-11,109.
10. A set of probes according to claim 2, wherein the probes are attached to a solid support.
11. A set of probes according to claim 10, wherein the solid support is selected from the group consisting of a membrane, a set of beads, a glass support and a silicon support.
12. A solid support comprising at least one probe, wherein each probe comprises a sequence that specifically hybridizes to a gene or the transcript of a gene comprising any one of SEQ ID NOS: 1-11,109.
13. A solid support of claim 12, wherein the solid support is an array comprising at least 10 different oligonucleotides in discrete locations per square centimeter.
14. A solid support of claim 12, wherein the array comprises at least 100 different oligonucleotides in discrete locations per square centimeter.
15. A solid support of claim 12, wherein the array comprises at least 1000 different oligonucleotides in discrete locations per square centimeter.
16. A solid support of claim 12, wherein the array comprises at least 10,000 different oligonucleotides in discrete locations per square centimeter.
17. A method of identifying tissue or cell markers, comprising: (a) detecting the level of expression in a tissue or cell sample from a canine of one or more genes comprising SEQ ID NOS: 1-11,109; wherein differential expression of the one or more genes identifies a marker.
18. A method of claim 17, further comprising: (b) comparing the level of expression of said one or more genes in step (a) to the level of expression of said genes in a control tissue or cell sample.
19. A method of claim 17, wherein the level of expression of one or more genes is detected with a probe that specifically hybridizes to a gene or a transcript of the gene.
20. A method of claim 19, wherein the probe is an oligonucleotide.
21. A method of claim 20, wherein the oligonucleotide is attached to a solid support.
22. A method of claim 21, wherein the solid support is a chip.
23. A method of claim 17, wherein the level of expression of one or more genes in step (a) is detected by polymerase chain amplification (PCR).
24. A method of claim 23, wherein the PCR is quantitative or semi-quantitative.
25. A method of claim 17, wherein step (a) comprises preparing cDNA from polyA-RNA isolated from the tissue or cell sample exposed to the toxin.
26. A method of claim 25, wherein cRNA is prepared from the cDNA.
27. A method of claim 17, wherein the tissue or cell sample is isolated from a dog or canine cells that have been exposed to a toxin.
28. A method of claim 17, wherein the tissue or cell sample is in vitro cultured.
29. A method of identifying toxicity markers, comprising: (a) detecting the level of expression in a tissue or cell sample exposed to a toxin of one or more genes comprising SEQ ID NOS: 1-11,109; wherein differential expression of the one or more genes is indicative of toxicity.
30. A method of preparing a gene expression profile of a tissue or cell sample, comprising: (a) detecting the level of expression in a first tissue or cell sample of one or more genes comprising SEQ ID NOS: 1-11,109; and (b) comparing the level of expression of said one or more genes in step (a) to the level of expression of said genes in a second tissue or cell sample.
31. A method of claim 30, wherein the comparing comprises calculating the differential expression for one or more genes in the first sample by dividing the level of expression for the one or more genes in step (a) by the level of expression detected for the corresponding one or more genes in the second tissue or cell sample.
32. A method of claim 31, wherein the first tissue or cell sample has been exposed to a toxin.
33. A method of claim 32, wherein the toxin is selected from the group consisting of a hepatotoxin, a nephrotoxin and a cardiotoxin.
34. A method of claim 33, wherein the hepatotoxin is selected from the group consisting of acyclovir, amitriptyline, alpha-naphthylisothiocyanate (ANIT), acetaminophen, AY-25329, bicalutamide, carbon tetrachloride, chloroform, clofibrate, cyproterone acetate (CPA), diclofenac, diflunisal, dioxin, 17α-ethinylestradiol, hydrazine, indomethacin, bacterial lipopolysaccharide, phenobarbital, tacrine, valproate, WY-14643, zileuton, 2-acetylaminofluorene (2-AAF), BI liver toxin, CI-1000, colchicine, dimethylnitrosamine (DMN), gemfibrozil, menadione, thioacetamide, methotrexate, lovastatin, amiodarone, carbamazepine, chlorpromazine, imipramine, tamoxifen and tetracycline.
35. A method of claim 33, wherein the nephrotoxin is selected from the group consisting of acyclovir, adriamycin, AY-25329, bromoethylamine HBr, carboplatin, cephaloridine, chloroform, cidofovir, cis-platin, citrinin, colchicine, cyclophosphamide, diclofenac, diflunisal, gentamicin, hydralazine, ifosfamide, indomethacin, lithium, menadione, mercuric chloride, pamidronate, puromycin aminonucleoside, sulfadiazine, sodium chromate, sodium oxalate, vancomycin and thioacetamide.
36. A method of claim 33, wherein the cardiotoxin is selected from the group consisting of cyclophosphamide, hydralazine, ifosfamide, minoxidil, BI-QT, clenbuterol, isoproterenol, norepinephrine, epinephrine, adriamycin, amphotericin B, epirubicin, phenylpropanolamine and rosiglitazone.
37. A method of preparing a gene expression profile indicative of a toxic effect of a compound, comprising: (a) detecting the level of expression in a tissue or cell sample exposed to the compound of one or more genes comprising SEQ ID NOS: 1-11,109; and (b) comparing the level of expression of said one or more genes in step (a) to the level of expression of said genes in a control tissue or cell sample.
38. A method of screening an agent for a potential toxic response, comprising: (a) preparing a gene expression profile comprising the level of expression of one or more genes comprising SEQ ID NOS: 1-11,109 from a cell or tissue sample exposed to the agent; and (b) comparing said gene expression profile to at least one gene expression profile prepared from a cell or tissue sample exposed to a known toxin.
39. A method of claim 38, further comprising: (a1) comparing the gene expression profile from the agent exposed cell or tissue sample to a control cell or tissue sample prior to the comparing of step (b).
40. A method of claim 38, wherein the level of expression of one or more genes is detected with a probe that specifically hybridizes to a gene or a transcript of the gene.
41. A method of claim 40, wherein the probe is an oligonucleotide.
42. A method of claim 41, wherein the oligonucleotide is attached to a solid support.
43. A method of claim 42, wherein the solid support is a chip.
44. A method of claim 38, wherein the level of expression of one or more genes in step (a) is detected by polymerase chain amplification (PCR).
45. A method of claim 44, wherein the PCR is quantitative or semi-quantitative.
46. A method of claim 38, wherein step (a) comprises preparing cDNA from polyA-RNA isolated from the tissue or cell sample exposed to the toxin.
47. A method of claim 46, wherein cRNA is prepared from the cDNA.
48. A method of claim 38, wherein the tissue or cell sample is isolated from a dog.
49. A method of claim 38, wherein the tissue or cell sample is in vitro cultured.
50. A computer system comprising: (a) a database of a set of genes comprising at least one gene comprising SEQ ID NOS: 1-11,109; and (b) a user interface to view the information.
51. A computer system of claim 50, wherein the database further comprises information identifying the expression level for said at least one gene in a tissue or cell sample from a canine tissue or cell sample exposed to a toxin.
52. A computer system of claim 51, wherein the database further comprises information identifying the expression level for said at least one gene in the tissue or cell sample before exposure to the toxin.
53. A computer system of claim 52, wherein the database further comprises information identifying the expression level of any one of SEQ ID NOS: 1-11,109 in toxin-exposed or normal liver, kidney, heart, brain, or testicular tissue.
54. A computer system of claim 51, wherein the database further comprises information identifying the expression level for said at least one gene in a tissue or cell sample exposed to at least a second toxin.
55. A computer system of claim 50, further comprising records including descriptive information from an external database, which information correlates said genes to records in the external database.
56. A computer system of claim 55, wherein the external database is GenBank.
57. A method of using a computer system of claim 50 to present information identifying the expression level in a tissue or cell sample of at least one gene comprising SEQ ID NOS: 1-11,109, comprising: (a) comparing the expression level of at least one gene in a tissue or cell exposed to a test agent to the level of expression of the gene in the database.
58. A method of claim 57, wherein the expression levels of at least about 100 genes are compared.
59. A method of claim 57, wherein the expression levels of at least about 1000 genes are compared.
60. A method of claim 57, wherein the expression levels of nearly all of the genes are compared.
61. A method of claim 57, wherein the expression levels of all of the genes are compared.
62. A method of claim 57, further comprising: (b) displaying the level of expression of at least one gene in the tissue or cell sample compared to the expression level when exposed to a toxin.
63. A kit comprising at least one solid support of claim 12.
64. A kit of claim 63, further comprising sequence or gene expression information for the genes.
65. A kit of claim 64, wherein the gene expression information comprises gene expression levels in a tissue or cell sample exposed to a toxin.
66. An oligonucleotide probe or primer that specifically hybridizes to a nucleic acid molecule comprising greater than 90% sequence identity across greater than 90% of the length of any one of SEQ ID NOS: 1-11,109.